Abstract — Throughput and per-packet delay can present strong 
trade-offs that are important in the cases of delay sensitive 
applications. We investigate such trade-offs using a random linear 
network coding scheme for one or more receivers in single hop 
wireless packet erasure broadcast channels. We capture the delay 
sensitivities across different types of network applications using 
a class of delay metrics based on the norms of packet arrival 
times. With these delay metrics, we establish a unified framework 
to characterize the rate and delay requirements of applications 
and optimize system parameters. In the single receiver case, 
we demonstrate the trade-off between average packet delay, 
which we view as the inverse of throughput, and maximum 
ordered inter-arrival delay for various system parameters. For a 
single broadcast channel with multiple receivers having different 
delay constraints and feedback delays, we jointly optimize the 
coding parameters and time-division scheduling parameters at 
the transmitters. We formulate the optimization problem as a 
Generalized Geometric Program (GGP). This approach allows 
the transmitters to adjust adaptively the coding and scheduling 
parameters for efficient allocation of network resources under 
varying delay constraints. In the case where the receivers are 
served by multiple non-interfering wireless broadcast channels, 
the same optimization problem is formulated as a Signomial Pro- 
gram, which is NP-hard in general. We provide approximation 
methods using successive formulation of geometric programs and 
show the convergence of approximations. 

Index Terms — Network Coding, Delay, Throughput, Optimiza- 
tion, Geometric Programming 

I. Introduction 

The growing diversity of network applications, protocols,
and architectures poses new problems related to the fundamental
trade-offs between throughput and delay in communications.
For instance, applications such as file downloading over FTP
aim solely to maximize the transmission rate and thus to
minimize the overall completion time. On the other hand,
applications such as real-time video conferencing are highly
sensitive to the delay between any two consecutive packets.
Failure to meet the continuous delivery deadlines of a stream of
packets quickly degrades the Quality of Experience (QoE) of the
user. These two extremes in delay sensitivity by no means
represent all types of applications. Progressive video
downloading, for example, is more delay sensitive than file
downloading but less sensitive than real-time video streaming,
since the receiver has buffered sufficient content.

In this paper, we develop a unified framework to study rate
and delay trade-offs of coding and scheduling schemes and
to optimize their performance for applications with different
delay sensitivities. We use a class of delay metrics based on
the $\ell_p$-norms of the packet arrival times to represent the
delay-rate characteristics and requirements of applications. At
one extreme, the delay metric captures the average delay and
thus the rate of transmission. At the other extreme, the metric
measures the maximum ordered inter-arrival delay. Based on
these delay metrics, we optimize coding and scheduling
parameters in a networking system where various devices with
different delay requirements are served by single-hop wireless
erasure broadcast channels, each associated with an access
point (AP).

The coding scheme in this paper is a variation of the
generation-based random linear network coding presented in
[1] and [2]. Specifically, the sender maintains a coding bucket
for each receiver. When a transmitter is ready to send a packet
to some receiver, it reads all the packets in the coding
bucket for that receiver and produces an encoded packet by
forming a random linear combination of them. The encoded
packet is then broadcast to all the receivers. Once a receiver
collects enough packets to decode all packets in the coding
bucket through Gaussian elimination, it uses a separate
feedback channel to send an ACK message back to the sender.
The sender always receives the ACK message after a certain
delay. It then purges all the packets in the coding bucket and
moves new packets into the bucket. The respective delay
constraints of the receivers are known to the sender, which
adaptively determines the number of packets to put in the
coding bucket for each receiver by solving system-wide
optimization problems. A precise description of the
transmission scheme is given in Section II. The coding buckets
act as the head-of-line (HOL) generations in most
generation-based schemes. However, unlike most
generation-based schemes, packets are not partitioned prior to
transmission, and the bucket sizes in our scheme may vary
over time and across different receivers, depending on each
receiver's changing delay constraints. The coding parameters
are optimized jointly with time-division resource allocation
parameters to exploit the trade-offs between rate and delay.
We first illustrate the trade-offs in the case of point-to-point
erasure channels. Then, in the case of multiple receivers
with one AP, we formulate the delay constrained optimization
problem as a Generalized Geometric Program, which can
be solved very efficiently. We compare the solutions with
fixed generation size schemes for specific examples. Finally,
in the case of multiple APs with non-interfering erasure
broadcast channels, we formulate the problem as a Signomial
Program and provide methods to approximate this non-convex
optimization with successive GPs.

There exists a significant amount of related literature, and we
examine only an incomplete set of relevant works. Previous
work by Walsh et al. [3] considers the rate and delay trade-off
in multipath network coded networks, while [4] studies
the related issue of rate-reliability and delay trade-off by
constructing various network utility maximization (NUM)
problems. The concept of network coding is introduced in
[5], and linear network coding is extensively studied in [6]
and [7]. Other typical rateless codes that are asymptotically
optimal for erasure channels are seen in [8], [9], [10]. However,
unlike linear network codes, which allow intermediate nodes
to recode packets, fountain codes are generally
only used for one-hop communication systems, as the packets
cannot be recoded due to stringent packet degree distribution
requirements. In our system, the delay constraints make it
difficult to apply fountain codes efficiently, as the asymptotic
optimality is only achieved when coding over a relatively large
number of packets. On the other hand, we have feedback,
which allows us to change the coding parameters dynamically.
The network coding gain in overall delay of file
downloading with multicast over packet erasure broadcast
channels is characterized in [11] and [12]. With the use of
similar linear network codes, broadcast coding schemes based
on perfect immediate feedback are proposed and their delay
characteristics are analyzed in [13], [14], [15]. An analysis of
random linear codes with finite block size is given in [16].

The remainder of the paper is organized as follows. Section
II introduces our model, the code and delay metrics, as well
as how the metrics apply specifically to the coding scheme.
Section III gives a concise primer on Geometric Programming
(GP), which is the basic tool for solving our optimization
program. Section IV considers a single wireless broadcast
channel with packet erasures. We construct a joint optimization
program, which is solved using GP techniques. Furthermore,
we illustrate the delay and throughput trade-offs with different
system parameters and compare the solutions with fixed
generation size schemes. Section V extends the results to multiple
non-interfering wireless channels. We provide approximation
algorithms for the non-convex optimization problem in this
case. Section VI concludes the paper.

II. Delay Metrics and Coding 

A. Adaptive Linear Coding Scheme 

Consider the point-to-point communication system illustrated
in Figure 1(a). The sender (Tx) and the receiver (Rx) are
connected by a wireless erasure channel with packet erasure
probability $\epsilon$ and a perfect feedback channel with delay $D$.
The sender looks to transmit to the receiver a flow $f$ consisting of
$N$ packets. The packets are denoted as $\{P^f_1, \ldots, P^f_N\}$. Each
of them is treated as a length-$m$ vector in the space $\mathbb{F}_q^m$
over some finite field $\mathbb{F}_q$. All $N$ data packets are assumed
to be available at the sender prior to any transmission. In
a fixed generation-based linear network coding scheme, the
sender chooses an integer $K \ge 1$ and sequentially partitions
the $N$ packets into $\lceil N/K \rceil$ generations
$\{G^f_1, \ldots, G^f_{\lceil N/K \rceil}\}$, where
$G^f_i = \{P^f_{(i-1)K+1}, \ldots, P^f_{\min(iK, N)}\}$. At each time slot $t$, the
sender reads the head-of-line (HOL) generation
$G^f_h = \{P^f_{h_1}, \ldots, P^f_{h_K}\}$, where $h \in \{1, \ldots, \lceil N/K \rceil\}$
is the generation index and $h_k$, $k = 1, \ldots, K$, are the indices
of the packets within the generation. It then generates a coded
packet $P[t]$ that is a linear combination of all packets in $G^f_h$
(shown in Figure 2), i.e.

$$P[t] = \sum_{k=1}^{K} a_k[t] \, P^f_{h_k}, \tag{1}$$

where $a[t] = (a_1[t], \ldots, a_K[t])$ is the coding coefficient
vector, which is chosen uniformly at random from $\mathbb{F}_q^K$ [1].
The coded packet, with the coefficient vector appended in the
header, is then sent to the receiver through the erasure channel.
The receiver collects coded packets over time. Given a large
enough field $\mathbb{F}_q$, the receiver is able, with high probability,
to decode the $K$ packets in the generation through Gaussian
elimination performed on the linear system formed by any $K$
coded packets. Once the receiver decodes the HOL generation
successfully, it sends an ACK message through the feedback
channel to the sender. The sender, which receives the ACK after
a delay of $D$ time slots, purges the old HOL generation
and moves on to the next generation in the line.
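The encoding in (1) and the Gaussian-elimination decoding step can be sketched in a few lines. This is only an illustration under stated assumptions: the field is taken to be the prime field GF(257) so that arithmetic reduces to integers modulo `Q`, packets are short lists of field symbols, and the names `Q`, `encode` and `decode` are hypothetical and do not come from the scheme's original description. The modular inverse via `pow(x, -1, Q)` requires Python 3.8 or later.

```python
import random

Q = 257  # assumed prime field size, so field arithmetic is integer arithmetic mod Q


def encode(bucket, rng):
    """Form one coded packet as in (1): a random linear combination of the bucket."""
    coeffs = [rng.randrange(Q) for _ in bucket]
    payload = [sum(c * pkt[i] for c, pkt in zip(coeffs, bucket)) % Q
               for i in range(len(bucket[0]))]
    return coeffs, payload  # the coefficient vector travels in the header


def decode(coded, k):
    """Gauss-Jordan elimination over GF(Q); returns the k packets, or None if rank < k."""
    rows = [list(c) + list(p) for c, p in coded]  # augmented matrix [coefficients | payload]
    for col in range(k):
        pivot = next((r for r in range(col, len(rows)) if rows[r][col]), None)
        if pivot is None:
            return None  # not enough linearly independent coded packets yet
        rows[col], rows[pivot] = rows[pivot], rows[col]
        inv = pow(rows[col][col], -1, Q)
        rows[col] = [v * inv % Q for v in rows[col]]
        for r in range(len(rows)):
            if r != col and rows[r][col]:
                factor = rows[r][col]
                rows[r] = [(a - factor * b) % Q for a, b in zip(rows[r], rows[col])]
    return [row[k:] for row in rows[:k]]


rng = random.Random(1)
bucket = [[rng.randrange(Q) for _ in range(4)] for _ in range(3)]  # K = 3, m = 4
received = []
decoded = None
while decoded is None:  # keep sending coded packets until the bucket decodes
    received.append(encode(bucket, rng))
    if len(received) >= len(bucket):
        decoded = decode(received, len(bucket))
```

With a field of this size, the first $K$ received packets are linearly independent with high probability, so the loop usually ends after exactly $K$ receptions.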

Our scheme modifies such generation-based network coding
in the following ways. The packets are not partitioned into
generations prior to transmission. Instead, a coding bucket
is created and acts like the HOL generation. We use the
term bucket to avoid confusion with normal generation-based
schemes. The size of the bucket in terms of number of packets is
denoted as $K$. The sender collects information about user-end
delay constraints and chooses the bucket size $K$ dynamically.
Figure 1(b) gives a simple example. At the beginning, the
coding bucket contains three packets $\{P_1, P_2, P_3\}$. The sender
keeps transmitting encoded packets, i.e. $P[1]$ to $P[5]$, of these
three packets. Upon receiving the ACK feedback, it empties
the bucket and decides to shrink the bucket size to 2, possibly
because of a tighter delay constraint experienced at the
receiver. Therefore, only two packets $\{P_4, P_5\}$ go into the
bucket for subsequent transmissions. We leave the details of
adaptively determining the coding bucket size to Section IV.

Fig. 1. Adaptive Linear Coding Scheme in Point-to-Point Case: (a) adaptive linear network coding based transmission model; (b) a session with varying coding bucket size

Fig. 2. An illustration of encoding

B. $\ell_p$-Norm Delay Metrics

Now we define the delay metrics used in the paper. Following
the notation of the previous part, let $T_i$ be the
time slot in which the packet $P^f_i$ is decoded at the receiver
and delivered to the upper layer. We require the delivery of
the original data packets $\{P^f_1, \ldots, P^f_N\}$ to be in order. In the
case when the sequence of decoded packets is out of order,
we assume that they are buffered at the receiver to ensure
in-order final delivery. $T_i$ represents the final in-order delivery
time of packet $P^f_i$, and we have $T_1 \le T_2 \le \cdots \le T_N$. We
define the inter-arrival times $\Delta T_i$ of the original packets to be:

$$\Delta T_1 = T_1 + D, \tag{2}$$
$$\Delta T_i = T_i - T_{i-1}, \quad i = 2, \ldots, N, \tag{3}$$
where $D$ is the feedback delay from the receiver to the sender.
Note that a feedback ACK message is always assumed to be
received correctly after $D$ time slots. However, when there is
more than one receiver, we assume that, in general, receivers
experience different feedback delays across the system owing
to their locations and channel variations. Let the size of each data
packet be $L$. We define the delay cost function as a metric of
the following form,

$$d(p) = \frac{1}{L} \left( \frac{1}{N} \sum_{i=1}^{N} E[\Delta T_i]^p \right)^{1/p}, \quad p \in [1, \infty), \tag{4}$$

where $E[\Delta T_i]$ is the expected value of $\Delta T_i$. The expectation
is taken over the distribution of packet erasures over the
system and all the randomness associated with the coding and
scheduling scheme, which are specified in Section IV.

Mathematically, the delay metric is a normalized $\ell_p$-norm of
the vector $[E[\Delta T_1] \cdots E[\Delta T_N]]^T$. Physically, however, $p$
measures the delay sensitivity of the receiver and depends
predominantly on the type of application running on the receiver.
As the value of $p$ varies from 1 to $\infty$, the delay function
becomes increasingly biased towards the large components of
the vector, hence indicating increasing user sensitivity toward
large inter-packet delays. As an example, consider the case
$p = 1$. Since $\sum_{i=1}^{N} E[\Delta T_i] = E[T_N] + D$, the delay
in (4) simplifies to

$$d(1) = \frac{E[T_N] + D}{LN}, \tag{5}$$

that is, $d(1)$ is the average delay per packet, normalized by
the total size of the received data. Minimizing $d(1)$, therefore,
is equivalent to average rate maximization for the receiver. On
the other hand, consider the case $p = \infty$. Because of
the max norm, the delay function in (4) reduces to

$$d(\infty) = \frac{\max_i E[\Delta T_i]}{L}. \tag{6}$$

Effectively, minimizing $d(\infty)$ translates into minimizing the
maximum expected inter-arrival time between any two successive
packets. We call this the per-packet delay.

The flexibility in choosing the $p$-value of the delay metric
provides a unified way of looking at the delay sensitivity at the
user side. A user who is downloading a file is mostly
concerned with shortening the overall completion time, or
average delay per packet. Consequently, $d(1)$ is the appropriate
delay metric to be optimized. On the other hand, if the
user is running a real-time video application, then $d(\infty)$
is more likely to be the right metric to minimize, as it
allows the sequence of packets to catch up quickly with their
respective delivery deadlines.
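To make the metric in (4) concrete, the following sketch evaluates $d(p)$ on a given vector of expected inter-arrival times. The function name `delay_metric` and the sample vector `gaps` are illustrative assumptions, not part of the original formulation; `p = math.inf` is handled separately as the max norm of (6).

```python
import math


def delay_metric(expected_gaps, p, L=1.0):
    """Normalized l_p-norm of expected inter-arrival times, as in (4)."""
    if math.isinf(p):
        return max(expected_gaps) / L  # max norm, as in (6)
    mean_pow = sum(g ** p for g in expected_gaps) / len(expected_gaps)
    return mean_pow ** (1.0 / p) / L


# Hypothetical pattern: two buckets of three packets, each bucket taking 8 slots.
gaps = [8.0, 0.0, 0.0, 8.0, 0.0, 0.0]
d1 = delay_metric(gaps, 1)           # average delay per packet
d2 = delay_metric(gaps, 2)
dinf = delay_metric(gaps, math.inf)  # per-packet delay
```

On this vector the metric grows with $p$, moving from the average gap towards the largest gap, which is the bias towards large components described above.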

C. Delay In Adaptive Coding Scheme 

In the adaptive coding scheme, a receiver decodes all
packets in the current bucket before informing the sender to
empty the bucket and move in new packets. Assume that the
rate at which the coded packets are transmitted is $r$. Consider
the transmission of a bucket of $K$ packets $\{P_{i_1}, \ldots, P_{i_K}\}$.
Once the receiver collects $K$ linearly independent coded
packets of the bucket, it decodes all $K$ packets together. Hence,
the ordered inter-arrival times of the original packets satisfy
$E[\Delta T_{i_1}] = K/r + D$ and $\Delta T_{i_2} = \cdots = \Delta T_{i_K} = 0$. In general,
consider the case when the bucket size remains the same for
a sequence of $N$ packets $\{P_{i_1}, \ldots, P_{i_N}\}$. $N$ is divisible by
$K$, as the bucket size may only change when the bucket is
emptied. The packets sequentially enter the bucket in
groups of $K$ packets. Then, for the inter-arrival time of the
$j$-th packet, we have

$$E[\Delta T_{i_j}] = \begin{cases} K/r + D, & \text{if } j \equiv 1 \pmod{K}, \\ 0, & \text{otherwise.} \end{cases} \tag{7}$$

Therefore, if the adaptive scheme chooses a bucket size of $K$
for a sequence of $N$ packets, we can simplify (4) to measure
the delay cost function for the transmission of the $N$ packets,
resulting in:

$$d(p) = \frac{1}{L} \left( \frac{\sum_{j=1}^{N} E[\Delta T_{i_j}]^p}{N} \right)^{1/p} \tag{8}$$
$$= \frac{1}{L} \left( \frac{\frac{N}{K} \left( \frac{K}{r} + D \right)^p}{N} \right)^{1/p} \tag{9}$$
$$= \frac{1}{L K^{1/p}} \left( \frac{K}{r} + D \right). \tag{10}$$

In particular, under this coding scheme, the delay $d(p)$ seen
by the receiver over the period is independent of $N$ as long
as the coding bucket size remains $K$. Hence, we drop
$N$ and only consider the bucket size $K$ for the rest of the paper.
Furthermore, in practice, $K$ takes only positive integer values
in $[1, K_{\max}]$, where $K_{\max}$ is the maximum bucket size,
limited by the maximum tolerable computation complexity of
the target system. In this work, for simplicity, we assume that
$K$ takes on real values in the same interval $[1, K_{\max}]$.
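As a quick numerical check of the simplification from (4) to (10), the sketch below builds the inter-arrival pattern of (7) explicitly and compares the direct evaluation with the closed form. The function names and the parameter values ($K = 4$, $r = 0.6$, $D = 5$, $p = 2$) are arbitrary illustrative choices, and only finite $p$ is handled in the direct evaluation.

```python
import math


def bucket_delay(K, r, D, p, L=1.0):
    """Closed form (10): d(p) = (K/r + D) / (L * K**(1/p))."""
    per_bucket = K / r + D
    if math.isinf(p):
        return per_bucket / L
    return per_bucket / (L * K ** (1.0 / p))


def direct_delay(N, K, r, D, p, L=1.0):
    """Evaluate (4) on the pattern of (7); N must be a multiple of K, p finite."""
    # Index j is 0-based here, so j % K == 0 matches "j = 1 (mod K)" in (7).
    gaps = [K / r + D if j % K == 0 else 0.0 for j in range(N)]
    return (sum(g ** p for g in gaps) / N) ** (1.0 / p) / L


closed = bucket_delay(4, 0.6, 5, 2)
direct_small = direct_delay(12, 4, 0.6, 5, 2)
direct_large = direct_delay(120, 4, 0.6, 5, 2)
```

Both values of $N$ give the same result as the closed form, which illustrates why $N$ can be dropped from the analysis.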

III. Geometric Programming 

We give a concise primer on Geometric Programming before
looking specifically into our system model. For more
comprehensive coverage of the topic, we refer the reader to
[17], [18]. Geometric programs (GPs) form a class of mathematical
optimization problems characterized by special forms of
objective functions and constraints. A typical GP is nonlinear
and non-convex, but can be converted into a convex program
so that a local optimum is also a global optimum. The theory
of GP has been well studied since the 1960s [19]. Well-developed
solution techniques, such as interior point methods, are capable
of solving GPs efficiently even for large-scale problems. Many
high-quality GP solvers are available (e.g. MOSEK and CVX
[20]) for providing robust numerical solutions for generalized
GPs (GGPs).

Consider a vector of decision variables $x = [x_1 \cdots x_n]^T$. A
real function $g : \mathbb{R}^n \to \mathbb{R}$ is said to be a monomial if it can
be written in the form

$$g(x) = c \prod_{i=1}^{n} x_i^{a_i}, \tag{11}$$

where the coefficient $c$ is positive and the exponents
$a_1, \ldots, a_n$ are arbitrary real numbers. A function $f : \mathbb{R}^n \to \mathbb{R}$
of the form

$$f(x) = \sum_{k=1}^{K} c_k \prod_{i=1}^{n} x_i^{a_{ik}}, \tag{12}$$

with all $c_k$ being positive real numbers, is called a posynomial.
A posynomial is the sum of an arbitrary number of monomials.
On top of this, any function $f$ that can be constructed from
posynomials using addition, multiplication, positive powers and
maximum operations is called a generalized posynomial.

A standard form geometric program is presented as follows:

$$\begin{aligned} \text{minimize} \quad & f_0(x) \\ \text{subject to} \quad & f_i(x) \le 1, \quad i = 1, \ldots, m, \\ & g_j(x) = 1 \quad \text{for all } j, \end{aligned} \tag{13}$$

where $f_i(x)$ are posynomials, $g_j(x)$ are monomials, and $x_i$
are the decision variables, which are also implicitly assumed
to be positive, i.e. $x_i > 0$, $i = 1, \ldots, n$. In particular,
the objective of the optimization has to be the minimization of
some posynomial. That is, to solve a maximization problem
with GP, the objective function has to be in the form of
some monomial $g(x)$, so that instead of maximizing $g(x)$
we can minimize $1/g(x)$, which is itself a monomial. In the case
where any $f_i(x)$ is a generalized posynomial, the optimization
program is said to be a generalized geometric program (GGP).
All generalized geometric programs can be converted into
standard geometric programs and solved efficiently.

Note that a GP in its standard form is non-convex, as
posynomials are in general non-convex. In order to apply general
convex optimization methods, a GP is usually transformed into
its convex form. Let $y_i = \log x_i$ so that $x_i = e^{y_i}$. The standard
form GP can then be transformed into its equivalent convex form,

$$\begin{aligned} \text{minimize} \quad & \log f_0(e^y) \\ \text{subject to} \quad & \log f_i(e^y) \le 0, \quad i = 1, \ldots, m, \\ & \log g_j(e^y) = 0 \quad \text{for all } j. \end{aligned} \tag{14}$$

In particular, a monomial constraint

$$g_j(x) = d_j \prod_{k=1}^{n} x_k^{a_{jk}} = 1$$

is converted to

$$\log g_j(e^y) = \log d_j + \sum_{k=1}^{n} a_{jk} y_k = 0, \tag{15}$$

which is affine and convex. On the other hand, the posynomial
parts are converted into log-sum-exp functions, which can be
easily shown to be convex. Therefore, although the original
standard formulations of GPs are nonlinear and non-convex,
they can be converted into convex form as in (14) and solved
efficiently. In this paper, we use GP to optimize the coding
parameters and resource allocation at the transmitter with
respect to the $\ell_p$-norm delay metric defined previously.
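The effect of the logarithmic change of variables can be seen on a toy example. The sketch below uses a hypothetical one-variable posynomial $f(x) = x^{0.5} + x^{-1}$ (chosen only for illustration, not taken from the system model), checks that $f$ fails a midpoint convexity test in $x$ while $\log f(e^y)$ passes it in $y$, and then minimizes the convex form with a plain ternary search. A real GP solver would use interior point methods instead; the ternary search is a stand-in that works here because the transformed function is convex in a single variable.

```python
import math

# Hypothetical posynomial f(x) = x**0.5 + x**-1, stored as (coefficient, exponent) pairs.
TERMS = [(1.0, 0.5), (1.0, -1.0)]


def posy(x):
    """The posynomial in the original variable x > 0."""
    return sum(c * x ** a for c, a in TERMS)


def log_posy(y):
    """Convex form: log f(e^y), a log-sum-exp of affine functions of y."""
    return math.log(sum(c * math.exp(a * y) for c, a in TERMS))


def ternary_min(fn, lo, hi, iters=200):
    """Minimize a convex one-dimensional function on [lo, hi]."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if fn(m1) < fn(m2):
            hi = m2
        else:
            lo = m1
    return (lo + hi) / 2


y_star = ternary_min(log_posy, math.log(1e-3), math.log(1e3))
x_star = math.exp(y_star)  # stationarity of f gives x = 2**(2/3) for this example
```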

IV. Single Broadcast Channel With Packet 
Erasures 

A. System Model 

The motivating scenario of the work comes from a typical 
home network environment with multiple user networking 
devices. The receivers are wirelessly connected to a WiFi 
access point (AP), which is then linked with the gateway to the 
Internet. All the flow of packets from the Internet to the user 
devices goes through the gateway and the access point. The 
applications running on different devices have very different 
delay sensitivities and constraints, as discussed before. The 
gateway and the access point look for the optimal coding and 
scheduling parameters to ensure the QoE of all the users within 
the network. 

Conceptually, we represent the system using the following
model. We assume that the link between the AP and the gateway
has a high capacity and is lossless. Thus, we represent
the gateway and the AP together as a single node $s$. We denote
the set of receivers by $T = \{t_1, \ldots, t_M\}$. Each receiver needs
to obtain a flow of packets from some source over the Internet.
Let $\mathcal{F} = \{f_1, \ldots, f_M\}$ be the set of flows, where $f_i$ is the
packet flow requested by receiver $t_i$. Note that all $f_i$ enter the
system from node $s$, which in turn acts as a source node. The
flows for different receivers are assumed to be independent.
The original data packets in each flow are numbered, with
$P^{f_i}_j$ representing the $j$-th packet in flow $f_i$. We assume that
there are always enough packets to be served for each flow,
which is the case under heavy traffic.
Furthermore, all packets are assumed to have the same size $L$,
and the system is time slotted. At any time slot,
the node $s$ is able to broadcast a size-$L$ packet to all receivers
through the packet erasure broadcast channel. Erasures happen
independently across all receivers and all time slots, i.e. the
channel is memoryless. We denote the erasure vector by
$\epsilon = [\epsilon_1 \cdots \epsilon_M]$, where $\epsilon_i$ represents the erasure probability
seen by receiver $t_i$. Figure 3 gives an illustration of the system
model.

Fig. 3. System Model with Single Transmitter



1) Scheduling Strategies: Most of the works we discussed
in Section I focus on linear network codes for multicast, in
which all the receivers request the same content from the
sources. In the system we consider here, however, we have
a multiple unicast scenario, as each sink looks to receive
its own flow, independently of the others. The resource at
node $s$ has to be shared among all the receivers. Specifically,
for every time slot, the sender $s$ has to decide
which receiver to transmit to. While many sophisticated
scheduling algorithms are available, for simplicity, we use
a simple stochastic scheduling algorithm. At any time slot,
the node $s$ serves receiver $t_j$, or flow $f_j$, with probability $a_j$,
independently of any other time slot. In the long run,
equivalently, the transmitter node $s$ spends a fraction $a_j$ of
its time serving receiver $t_j$. We call the vector $a = (a_1, \ldots, a_M)$
the vector of scheduling coefficients.

2) Intra-session Coding: The adaptive coding scheme
described in Section II is used in the system. In the multiple
receiver case, we use intra-flow coding, i.e. each unicast
flow is coded independently and separately from the others. The
coding bucket sizes and scheduling coefficients, however, are
determined by solving system-wide optimizations. In this case,
for a given time slot, if the transmitter decides to serve receiver
$t_j$, it looks for packets in the coding bucket of flow $f_j$ and
encodes these packets using random linear network codes.
The coded packet is broadcast to all the receivers. With
probability $1 - \epsilon_j$, the targeted receiver $t_j$ receives it
correctly. Note that we assume the coding coefficients are
embedded in the header of the packet and that their size is
negligible compared to the size of the packet $L$. The coding bucket
size for flow $f_j$ is denoted as $K_j$. In general, $K_i \ne K_j$ for
$i \ne j$, and $K_j$ may vary over time as the delay requirements
at the receivers change. Let $K = (K_1, \ldots, K_M)$. We aim to
optimize both $a$ and $K$, based on the varying delay constraints
at the receivers.
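The combined effect of stochastic scheduling and erasures on one receiver can be illustrated with a small Monte Carlo sketch. It assumes that every coded packet received for the current bucket is linearly independent of the earlier ones (reasonable for a large field, but an assumption nonetheless), so that a bucket of $K_j$ packets needs exactly $K_j$ successful receptions; under that assumption the expected number of slots is $K_j / (a_j (1 - \epsilon_j))$. The function names and parameter values are hypothetical.

```python
import random


def slots_to_fill_bucket(K, a_j, eps_j, rng):
    """Count slots until receiver j has collected K coded packets of its bucket."""
    slots = 0
    got = 0
    while got < K:
        slots += 1
        # The slot is useful only if flow j is scheduled AND the packet is not erased.
        if rng.random() < a_j and rng.random() >= eps_j:
            got += 1
    return slots


def mean_slots(K, a_j, eps_j, runs=20000, seed=0):
    """Average the slot count over many independent buckets."""
    rng = random.Random(seed)
    return sum(slots_to_fill_bucket(K, a_j, eps_j, rng) for _ in range(runs)) / runs


estimate = mean_slots(K=10, a_j=0.3, eps_j=0.4)
expected = 10 / (0.3 * (1 - 0.4))  # K / (a_j * (1 - eps_j)), about 55.6 slots
```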

B. Delay Optimization 

We first consider the case where there is only a single
receiver, i.e. $M = 1$. Since there is no scheduling issue
or system-wide fairness consideration in this case, it makes
sense to minimize the delay cost function associated with the
receiver. As there is no ambiguity of notation, we drop all
the subscripts. It is easy to see that the packet transmission
rate in this case is $1 - \epsilon$ for the receiver and thus the expected
time for receiving $K$ coded packets is $K/(1-\epsilon)$. Subsequently, the
$\ell_p$-norm delay cost function minimization problem is given as
follows,

$$\text{minimize} \quad d(p) = \frac{1}{L K^{1/p}} \left( \frac{K}{1-\epsilon} + D \right) \tag{16}$$
$$\text{subject to} \quad 1 \le K \le K_{\max}. \tag{17}$$

The optimal bucket size $K^*$ can be obtained by setting the
gradient of the Lagrangian to zero. We have

$$K^* = \left( \frac{D(1-\epsilon)}{p-1} \right)_{[1, K_{\max}]}, \quad 0 < \epsilon < 1, \tag{18}$$

where the subscript denotes the projection,

$$(x)_{[a,b]} = \min(\max(a, x), b).$$
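The closed form (18) can be cross-checked against a brute-force search over the objective (16). The sketch below is illustrative only: the function names, the grid step and the parameter values ($\epsilon = 0.4$, $D = 5$, $K_{\max} = 100$) are assumptions made for the check. The case $p = 1$ is treated separately because the unconstrained minimizer is then unbounded and the projection returns $K_{\max}$.

```python
def d_p(K, eps, D, p, L=1.0):
    """Objective (16) for a single receiver with transmission rate 1 - eps."""
    return (K / (1.0 - eps) + D) / (L * K ** (1.0 / p))


def k_star(eps, D, p, k_max):
    """Projected minimizer (18)."""
    if p <= 1.0:
        return float(k_max)  # d(1) decreases in K, so the upper bound is optimal
    return min(max(1.0, D * (1.0 - eps) / (p - 1.0)), float(k_max))


def brute_force(eps, D, p, k_max, step=1e-3):
    """Grid search over [1, k_max]; slow but independent of the closed form."""
    n = int(round((k_max - 1) / step))
    grid = (1.0 + i * step for i in range(n + 1))
    return min(grid, key=lambda K: d_p(K, eps, D, p))


interior = (k_star(0.4, 5, 1.5, 100), brute_force(0.4, 5, 1.5, 100))    # near 6
lower = (k_star(0.4, 5, 4.0, 100), brute_force(0.4, 5, 4.0, 100))       # clipped to 1
upper = (k_star(0.4, 5, 1.02, 100), brute_force(0.4, 5, 1.02, 100))     # clipped to 100
```

The projection is valid here because (16) is convex in $\log K$, so the objective is unimodal in $K$ and the constrained optimum is the clipped unconstrained one.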

For a better understanding of the delay metrics, consider the
relation between $d(1)$ and $d(\infty)$. From (16) with $p = 1$, we have

$$K = \frac{D(1-\epsilon)}{(1-\epsilon) L d(1) - 1}. \tag{19}$$

Substituting (19) into $d(\infty) = (K/(1-\epsilon) + D)/L$, the trade-off
between $d(1)$ and $d(\infty)$ can be expressed as follows:

$$d(\infty) = \frac{D (1-\epsilon) \, d(1)}{(1-\epsilon) L d(1) - 1}. \tag{20}$$

Ignoring the bucket size constraints for simplicity, given $D$,
we can vary $K$ from 1 to $\infty$ and plot the values of $d(\infty)$
against $d(1)$ to obtain the trade-off curve. Each point on the curve
corresponds to a choice of $K$, which is equivalent to a choice
of optimizing $d(p)$ for some $p$ because of (18). Therefore, the
choice of $p$ at the receiver indicates the point on the trade-off
curve of $d(1)$ and $d(\infty)$ that is desired by the receiver.
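The trade-off curve can be traced numerically by sweeping $K$ and evaluating both metrics from (16), then comparing against the closed-form relation (20). The function names and the parameters ($\epsilon = 0.4$, $D = 5$, $L = 1$) are illustrative assumptions.

```python
def d1(K, eps, D, L=1.0):
    """Average delay per packet: (16) with p = 1."""
    return (1.0 / (1.0 - eps) + D / K) / L


def d_inf(K, eps, D, L=1.0):
    """Per-packet delay: (16) in the limit p -> infinity."""
    return (K / (1.0 - eps) + D) / L


def d_inf_from_d1(d1_val, eps, D, L=1.0):
    """Trade-off relation (20), expressing d(inf) through d(1)."""
    return D * (1.0 - eps) * d1_val / ((1.0 - eps) * L * d1_val - 1.0)


eps, D = 0.4, 5
curve = [(d1(K, eps, D), d_inf(K, eps, D)) for K in range(1, 51)]
max_gap = max(abs(d_inf_from_d1(x, eps, D) - y) for x, y in curve)
```

As $K$ grows, the first coordinate falls towards $1/(L(1-\epsilon))$ while the second grows without bound, which is the tension between average delay and per-packet delay.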

We can also use the zero duality gap in GP to obtain the
optimal $d(p)$ directly from the dual function. Note that the
dual function of (16) is given by

$$v(\beta) = \left( \frac{1}{(1-\epsilon) L \beta_1} \right)^{\beta_1} \left( \frac{D}{L \beta_2} \right)^{\beta_2}, \tag{21}$$

where $\beta = (\beta_1, \beta_2)$ can be obtained from solving a simple
linear system,

$$\beta_1 + \beta_2 = 1, \quad \text{(normality condition)}$$
$$(1 - 1/p)\beta_1 + (-1/p)\beta_2 = 0. \quad \text{(orthogonality condition)} \tag{22}$$
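A short numerical check of the zero duality gap, under the assumption that the bucket size bounds (17) are inactive and that $p > 1$: solving the linear system (22) gives $\beta_1 = 1/p$ and $\beta_2 = 1 - 1/p$, and the standard GP dual value $(c_1/\beta_1)^{\beta_1} (c_2/\beta_2)^{\beta_2}$, with $c_1 = 1/((1-\epsilon)L)$ and $c_2 = D/L$ the coefficients of the two monomial terms of (16), should then equal the primal optimum at the unprojected $K^*$ from (18). The function names are hypothetical.

```python
def primal_opt(eps, D, p, L=1.0):
    """Value of (16) at the unconstrained minimizer K* = D(1 - eps)/(p - 1); needs p > 1."""
    K = D * (1.0 - eps) / (p - 1.0)
    return (K / (1.0 - eps) + D) / (L * K ** (1.0 / p))


def dual_opt(eps, D, p, L=1.0):
    """GP dual value with beta solving the normality and orthogonality conditions (22)."""
    b1, b2 = 1.0 / p, 1.0 - 1.0 / p
    c1, c2 = 1.0 / ((1.0 - eps) * L), D / L  # coefficients of K**(1-1/p) and K**(-1/p)
    return (c1 / b1) ** b1 * (c2 / b2) ** b2


gaps = [abs(primal_opt(0.4, 5, p) - dual_opt(0.4, 5, p)) for p in (1.5, 2.0, 3.0, 8.0)]
```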



C. Delay Constrained Optimization with GP 

1) GP Formulation: For $M > 1$, instead of minimizing the
delay of a specific receiver, we are interested in optimizing a
certain system-wide utility function subject to the constraint that
the $\ell_p$-norm delay requirement is satisfied at each receiver.
We assume that each receiver $t_j$ monitors the delay constraints
for the targeted QoE of its applications and sets a maximum
acceptable delay $d_j(p_j)$, corresponding to its delay sensitivity
$p_j$. For the objective function, we choose to maximize the
minimum rate over all receivers. If the packet transmission rate
to $t_j$ is $r_j$, then the actual data rate received by $t_j$ is
$(1/r_j + D_j/K_j)^{-1}$, where $D_j$ is the feedback delay of $t_j$.
Let $r = (r_1, \ldots, r_M)$. The optimization problem is then given
as follows:

$$\max_{K, r, a} \ \min_j \ \left( \frac{1}{r_j} + \frac{D_j}{K_j} \right)^{-1} \tag{23}$$

subject to

$$\frac{1}{L K_j^{1/p_j}} \left( \frac{K_j}{r_j} + D_j \right) \le d_j(p_j) \quad \forall j = 1, \ldots, M, \tag{24}$$
$$r_j \le a_j (1 - \epsilon_j) \quad \forall j = 1, \ldots, M, \tag{25}$$
$$\sum_{j=1}^{M} a_j \le 1, \tag{26}$$
$$1 \le K_j \le K_{\max} \quad \forall j = 1, \ldots, M. \tag{27}$$

In the above formulation, constraints (24) and (25) represent
the delay and rate constraints respectively for receiver $t_j$,
while (26) is the scheduling probability constraint at the sender
node $s$. The problem is a Generalized Geometric Program. In
particular, all constraints can be converted into upper bounds
on posynomials of $K$, $r$ and $a$. The only non-posynomial part
is the objective function, which can be transformed into upper
bounding posynomial constraints and a monomial objective by
adding an auxiliary variable $x$,

$$\max \quad x \tag{28}$$
$$\text{subject to} \quad x \left( \frac{1}{r_j} + \frac{D_j}{K_j} \right) \le 1 \quad \forall j. \tag{29}$$

Combining this with (24) to (27), we have a GP that can be
efficiently solved.
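For small instances, the max-min problem can also be approached without a GP solver, which is useful as a sanity check. The sketch below is a swapped-in technique, not the GP method itself: it bisects on the auxiliary variable $x$ and, for each candidate $x$, finds by grid search over integer bucket sizes the smallest scheduling share each receiver needs to satisfy (24), (25) and (29). It assumes the data rate expression $(1/r_j + D_j/K_j)^{-1}$, integer $K_j$, and a hypothetical bound $K_{\max} = 100$; the function names are illustrative.

```python
import math


def min_share(x, eps, D, p, d_max, k_max, L=1.0):
    """Smallest a_j giving data rate >= x within the delay bound, or inf if impossible."""
    best = math.inf
    for K in range(1, k_max + 1):
        rate_room = 1.0 / x - D / K                    # (29): 1/r <= 1/x - D/K
        delay_room = L * d_max * K ** (1.0 / p) - D    # (24): K/r <= L d K^(1/p) - D
        if rate_room <= 0 or delay_room <= 0:
            continue
        r = max(1.0 / rate_room, K / delay_room)       # smallest r meeting both bounds
        best = min(best, r / (1.0 - eps))              # (25): a >= r / (1 - eps)
    return best


def max_min_rate(eps, D, p, d_max, k_max, L=1.0, iters=60):
    """Bisection on x; returns 0.0 if no positive rate is found to be feasible."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        x = (lo + hi) / 2
        total = sum(min_share(x, e, Dj, pj, dj, k_max, L)
                    for e, Dj, pj, dj in zip(eps, D, p, d_max))
        if total <= 1.0:   # (26): the shares fit within one transmitter
            lo = x
        else:
            hi = x
    return lo


eps = [0.4, 0.1, 0.15, 0.2, 0.25]
rate_all_p1 = max_min_rate(eps, [5] * 5, [1] * 5, [50] * 5, 100)
rate_p1_is_3 = max_min_rate(eps, [5] * 5, [3, 1, 1, 1, 1], [50] * 5, 100)
```

The bisection is valid because the share each receiver needs is non-decreasing in $x$. Raising the delay sensitivity of the first receiver tightens its delay constraint and can only lower the achievable minimum rate.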

D. Illustrations of Trade-offs 



Fig. 4. Trade-off of $d(1)$ vs $d(\infty)$ with varying $D$

1) Trade-off: Average Delay vs Per-packet Delay: Figure 4
demonstrates the trade-off between $d(1)$ and $d(\infty)$ following
Equation (20) with various values of $D$ and erasure probability
$\epsilon = 0.4$. As discussed previously, if we parameterize $d(1)$ and
$d(\infty)$ on the optimal bucket size $K^*$, as $p$ varies from 1 to $\infty$,
we obtain the same curves. The shaded area bounded by each
curve is the area of all achievable pairs $(d(1), d(\infty))$ for the
specific feedback delay. With small $D$, low delay in both $d(1)$
and $d(\infty)$ can be achieved. However, when the feedback delay
increases, the trade-off becomes increasingly stronger. This is
evident from Equation (20), where $D$ appears in the numerator.
It is expected since, for average delay, coding over larger
generations amortizes the feedback delay over more packets.
But for the per-packet delay $d(\infty)$, increased feedback delay
must be compensated by an even smaller generation size for more
frequent decoding. This is also consistent with Equation (18),
where $K^*$ increases with the feedback delay $D$ and decreases as
the delay sensitivity $p$ increases.

2) Adaptive Scheme vs Fixed Generation Coding: Figures
5 to 7 show some comparisons between the adaptive coding
scheme and fixed generation size coding schemes,
as the delay sensitivity $p_1$ of the first receiver increases.
In this example, we have 5 receivers, with erasure probabilities
$\epsilon = [0.4, 0.1, 0.15, 0.2, 0.25]$, the same $D = 5$, $L = 1$ and
$d_j = 50/L$. Except for receiver 1, whose $p_1$ value varies, we
have $p_j = 1$ for all other receivers. For the fixed generation
size schemes, we choose $K = 25$ and $K = 100$ to
represent small and large generations respectively. In these
cases, the scheduling coefficients are the only decision variables
in the optimization for min rate. From Figures 5 and 6 we
can see that, initially, the min rates for the different schemes
are relatively close. The adaptive scheme is able to choose
a much larger coding bucket size to obtain some rate gain
compared to the $K = 25$ case. As $p_1$ increases, the fixed
generation schemes are unable to reduce the generation size. In
order to meet the increasingly stringent delay constraint, the
sender has to devote more and more time to receiver 1, as
seen in Figure 7. Inevitably, the time for serving other receivers
is greatly reduced and the min rate of the system decreases
quickly. In the $K = 100$ case, the delay requirements cannot
be satisfied for $p_1 > 2.9$. In contrast, for the adaptive
scheme, which optimizes bucket size and scheduling jointly,
there is little decrease in min rate. For receivers with low delay
sensitivity, the scheme assigns large coding bucket
sizes to allow rate gain. As a result, the sender is able to
meet their delay-rate constraints with less serving time and
save time for more delay-sensitive receivers. On the other hand,
as the $p$ value for some receiver increases, its coding bucket
size is reduced to quickly decrease the per-packet delay. Hence,
the scheme is able to accommodate highly delay-sensitive
receivers much better.

Fig. 5. Min Rate vs Delay Sensitivity $p_1$

Fig. 6. Coding Bucket Size vs Delay Sensitivity $p_1$

V. Multiple Wireless Packet Erasure Channels 

With the proliferation of low-cost access points, many 
devices may be covered by more than one access point in 
wireless home, campus or enterprise networks. That leads to an 
important extension of the work to the case of multiple 
broadcast erasure channels covering the same set of receivers. 
As in the previous section, we still have the same set of 
receivers, $T = \{t_1, \cdots, t_M\}$. However, there are now W 
access points, or transmitters, denoted by the set 
$S = \{s_1, \cdots, s_W\}$. Instead of an erasure probability 
vector, we have an erasure probability matrix 
$\epsilon = [\epsilon_{ij}]$, where $\epsilon_{ij}$ is the erasure 
probability between node $s_i$ and $t_j$. An example of the 
system is illustrated in Figure 8. We assume that the channels 
are orthogonal or non-interfering. 

Fig. 7. $\alpha_1$ vs. Delay Sensitivity $p_1$ 

Fig. 8. An example system with 2 senders and 4 receivers 

The same coding and scheduling scheme is used for the 
new system. We use $\alpha = [\alpha_{ij}]$ to represent the 
probability of transmitter $s_i$ serving the flow $f_j$ at any 
time slot. The scheduling and coding optimization is done at the 
gateway node G, which coordinates all senders that perform 
encoding. Furthermore, for each flow $f_j$, all senders in S use 
the same coding bucket size $K_j$, dictated by node G. This 
ensures that, for each flow, every sender sends coded packets 
from coding buckets consisting of the same data packets, 
and guarantees decodability. An important feature of 
random linear coding is that all coded packets from the same 
bucket are exchangeable. That avoids complicated scheduling 
based on sequence numbers of the uncoded packets and helps 
to reduce transmission redundancy in erasure channels. 
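
The exchangeability of coded packets can be illustrated with a minimal Python sketch. It is not the paper's implementation: it assumes coding over GF(2) with coefficient vectors stored as integer bitmasks, and it assumes every sender transmits for the same flow in every slot (the scheduling probabilities are ignored). The names `gf2_rank` and `slots_until_decodable` are hypothetical.

```python
import random


def gf2_rank(vectors):
    """Rank over GF(2) of coefficient vectors stored as integer bitmasks."""
    basis = []  # independent vectors, each with a distinct leading bit
    for v in vectors:
        for b in basis:
            v = min(v, v ^ b)  # clears b's leading bit in v if it is set
        if v:
            basis.append(v)
    return len(basis)


def slots_until_decodable(K, erasure_probs, rng):
    """Count slots until the receiver holds K independent coded packets.

    Assumption: in each slot every sender transmits one random GF(2)
    combination of the same K-packet bucket. The receiver keeps any packet
    that is not erased, no matter which sender it came from.
    """
    received, slots = [], 0
    while gf2_rank(received) < K:
        slots += 1
        for eps in erasure_probs:        # one erasure probability per sender
            coeffs = rng.getrandbits(K)  # random coding coefficients
            if coeffs and rng.random() >= eps:
                received.append(coeffs)
    return slots, received


if __name__ == "__main__":
    slots, received = slots_until_decodable(8, [0.3, 0.5], random.Random(0))
    print(slots, len(received), gf2_rank(received))
```

The receiver only tracks the rank of what it has collected, so no per-sender sequence numbers are needed; any K linearly independent combinations suffice for decoding.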

A. Signomial Program Formulation 

Similarly to the single-sender case, we can formulate 
an optimization program for determining K, $\alpha$ and r. For 
example, the rate product maximization is given as 

$$\min_{K,\, r,\, \alpha,\, R} \; \prod_j R_j^{-1}$$

subject to 

where the delay constraint (31) and complexity constraint (34) 
remain the same. Auxiliary variables $R_j$ and constraint (33) 
are used to represent the average rate of each receiver. 
Maximizing the rate product is equivalent to minimizing the 
product of average delays $R_j^{-1}$, hence the objective 
$\prod_j R_j^{-1}$. The packet transmission rate for each 
receiver in this case is bounded by 
$\sum_i \alpha_{ij}(1 - \epsilon_{ij})$. However, owing to the 
existence of this new transmission rate constraint (32), the 
problem becomes truly non-convex. In particular, the constraint 
can be written as 

$$r_j + \sum_i (-\alpha_{ij})(1 - \epsilon_{ij}) \leq 0, \tag{36}$$



which is an upper bound constraint on a signomial. A signomial 
is a sum of monomials whose multiplicative coefficients 
can be either positive or negative. The problem therefore 
belongs to a more general class of problems called Signomial 
Programs, which are truly non-convex and NP-hard in general. 
Only locally optimal solutions can be efficiently computed. 
Based on the widely used monomial condensation methods, 
we provide an efficient way to approximate the solution 
with successive GP solutions. 
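
To make the indices in the transmission rate constraint concrete, the following Python sketch evaluates the per-receiver bound $\sum_i \alpha_{ij}(1 - \epsilon_{ij})$ and the left-hand side of the signomial form of the constraint. The function names and the numbers in the example are illustrative assumptions, not values from the paper.

```python
def rate_bounds(alpha, eps):
    """Per-receiver bound sum_i alpha[i][j] * (1 - eps[i][j]).

    alpha[i][j]: probability that transmitter i serves flow j in a slot.
    eps[i][j]:   erasure probability between transmitter i and receiver j.
    """
    n_receivers = len(alpha[0])
    return [
        sum(a_row[j] * (1.0 - e_row[j]) for a_row, e_row in zip(alpha, eps))
        for j in range(n_receivers)
    ]


def signomial_lhs(r, alpha, eps):
    """Left-hand side r_j - sum_i alpha_ij (1 - eps_ij); feasible when <= 0."""
    return [rj - bound for rj, bound in zip(r, rate_bounds(alpha, eps))]


if __name__ == "__main__":
    # Hypothetical system: 2 transmitters (rows), 2 receivers (columns).
    alpha = [[0.5, 0.5], [0.25, 0.75]]
    eps = [[0.2, 0.4], [0.4, 0.2]]
    print(rate_bounds(alpha, eps))
    print(signomial_lhs([0.5, 1.0], alpha, eps))
```

The negative coefficients on the $\alpha_{ij}$ terms are what make the left-hand side a signomial rather than a posynomial.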

B. Successive GP Approximation 

Consider an arbitrary signomial $h(x)$. It can always be 
written as the difference between two posynomials, i.e. 
$h(x) = f^+(x) - f^-(x)$. The inequality $h(x) \leq 0$ is then 
equivalent to 

$$\frac{f^+(x)}{f^-(x)} \leq 1.$$

We can approximate the left-hand side with a posynomial using 
common condensation methods [18]. In single condensation, the 
posynomial denominator $f^-(x)$ is approximated using a monomial 
$g^-(x)$, which in turn allows $f^+(x)/f^-(x)$ to be approximated 
by a posynomial $f^+(x)/g^-(x)$. In double condensation, both 
$f^+$ and $f^-$ are approximated using monomials, which creates 
a monomial approximation of $f^+(x)/f^-(x)$. In our case, both 
methods are equivalent, since we have $f^+(x) = r_j$, which is 
itself a monomial. One of the commonly used condensation methods 
is based on the following Lemma [18]. 

Lemma 1: Given a posynomial $f(x) = \sum_i u_i(x)$, choose 
$\beta_i > 0$ such that $\sum_i \beta_i = 1$; then the following 
bound holds: 

$$f(x) \geq g(x) = \prod_i \left( \frac{u_i(x)}{\beta_i} \right)^{\beta_i}. \tag{37}$$

Furthermore, equality holds when $x = x_0$ and 
$\beta_i = u_i(x_0) / f(x_0)$. 

Proof: The result can be easily proved using the inequality of 
arithmetic and geometric means (AM-GM). 
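
The bound in Lemma 1 can be checked numerically with a short Python sketch. The posynomial $f(x) = 2x + 3/x$ and the expansion point $x_0 = 1.5$ are arbitrary choices made here for illustration, and the helper names are hypothetical.

```python
import math


def condensation_weights(terms_at_x0):
    """Weights beta_i = u_i(x0) / f(x0) from the monomial term values at x0."""
    total = sum(terms_at_x0)
    return [u / total for u in terms_at_x0]


def condensed_value(term_values, beta):
    """Monomial lower bound g = prod_i (u_i / beta_i) ** beta_i."""
    return math.prod((u / b) ** b for u, b in zip(term_values, beta))


def terms(x):
    """Monomial terms of the illustrative posynomial f(x) = 2x + 3/x."""
    return [2.0 * x, 3.0 / x]


if __name__ == "__main__":
    x0 = 1.5
    beta = condensation_weights(terms(x0))
    for x in (0.5, 1.0, 1.5, 2.0, 4.0):
        f = sum(terms(x))
        g = condensed_value(terms(x), beta)
        print(f"x={x}: f(x)={f:.4f}, g(x)={g:.4f}, gap={f - g:.4f}")
```

The gap between $f$ and $g$ vanishes at $x_0$ and is non-negative elsewhere, which is why replacing $f^-$ by its condensed monomial yields a conservative constraint.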

Using Lemma 1, we can approximate constraint (32) in the 
signomial program with the following: 

$$r_j \prod_i \left( \frac{\alpha_{ij}(1 - \epsilon_{ij})}{\beta_{ij}} \right)^{-\beta_{ij}} \leq 1. \tag{38}$$

In particular, the optimization program is then a Geometric 
Program if we replace (32) with (38). Furthermore, given the 
monomial approximation in (38), we can construct successive 
GPs based on refined approximations of constraint (32) to 
approach locally optimal solutions of the original Signomial 
Program. The algorithm is summarized in Algorithm 1. 

Algorithm 1: Successive GP Approximation of SP 

Begin: A feasible solution $(K^0, \alpha^0, r^0, R^0)$, t = 0; 
repeat 
    Compute $f(\alpha^t) = \sum_i \alpha^t_{ij}(1 - \epsilon_{ij})$; 
    Compute $\beta_{ij} = \alpha^t_{ij}(1 - \epsilon_{ij}) / f(\alpha^t)$; 
    Construct the t-th approximation and replace 
    constraint (32) with the monomial constraint; 
    t = t + 1; 
    Solve the resulting GP to get $(K^t, \alpha^t, r^t, R^t)$; 
until Convergence; 
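
The loop structure of Algorithm 1 can be sketched in Python on a deliberately simplified toy problem. This is not the paper's program: it assumes a single receiver, drops the delay and complexity constraints, and replaces the scheduling constraints with a hypothetical shared budget $\sum_i \alpha_i \leq 1$, so that the condensed GP (maximize $r$ subject to the monomial rate constraint) has the closed-form solution $\alpha_i = \beta_i$. In a realistic setting that step would call a GP solver instead. All function names are hypothetical.

```python
import math


def condense(alpha, c):
    """AM-GM weights beta_i = c_i * alpha_i / f(alpha), with c_i = 1 - eps_i."""
    f = sum(ci * ai for ci, ai in zip(c, alpha))
    return [ci * ai / f for ci, ai in zip(c, alpha)], f


def solve_condensed_gp(beta, c):
    """Closed-form optimum of the toy condensed GP (assumption, see lead-in).

    Maximizing prod_i (c_i * alpha_i / beta_i) ** beta_i subject to
    sum_i alpha_i <= 1 gives alpha_i = beta_i and r = prod_i c_i ** beta_i.
    """
    alpha = list(beta)
    r = math.prod(ci ** bi for ci, bi in zip(c, beta))
    return alpha, r


def successive_gp(c, alpha0, tol=1e-9, max_iter=1000):
    """Repeat condensation and GP solve until the rate stops improving."""
    alpha, rates = list(alpha0), []
    for _ in range(max_iter):
        beta, _ = condense(alpha, c)             # approximate at the current point
        alpha, r = solve_condensed_gp(beta, c)   # solve the resulting GP
        rates.append(r)
        if len(rates) > 1 and abs(rates[-1] - rates[-2]) < tol:
            break
    return alpha, rates


if __name__ == "__main__":
    alpha, rates = successive_gp(c=[0.9, 0.6], alpha0=[0.5, 0.5])
    print(len(rates), rates[0], rates[-1], alpha)
```

In this toy case the achieved rate is non-decreasing across iterations and the allocation moves toward the channel with the smaller erasure probability, which mirrors the monotone improvement used in the convergence argument.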

C. Convergence 

Given Lemma 1, it is easy to show that Algorithm 1 always 
converges. According to Lemma 1, the values $\beta_{ij}$ are 
chosen such that for the local approximation at $\alpha^t$ in 
the t-th iteration, we have 

$$g(\alpha^t) = f(\alpha^t) \geq f(\alpha^{t-1}). \tag{39}$$

Let the optimal objective for iteration t be $Z^{*,t}$. Then we 
have $Z^{*,t} \leq Z^{*,t-1}$. Furthermore, at a local optimum 
$\alpha^*$, it can be verified that $f(\alpha^*) = g(\alpha^*)$ 
and $\nabla f(\alpha^*) = \nabla g(\alpha^*)$, which 
shows that the algorithm will indeed converge to a point 
that satisfies the KKT conditions. In fact, in many cases, it 
converges to the global optimum. Figures 9 and 10 show the 
convergence of min rate and bucket sizes for an example system 
with 3 receivers and 2 transmitters. 

Fig. 9. Convergence of Optimal Min rate in the example system 

Fig. 10. Convergence of Bucket Sizes in the example system 

VI. Conclusions 

In this paper, we consider the trade-off between rate and 
delay in single-hop packet erasure broadcast channels with 
random linear coding schemes. We characterize the delay and 
rate requirements of various users with a unified framework 
based on the $\ell_p$-norm delay metrics defined on the in-order 
packet arrival times. Using the optimal trade-off curve between 
the average delay, which can be viewed as the inverse of 
rate, and the per-packet delay, we demonstrate how feedback 
delays and the choice of coding bucket sizes affect the 
trade-offs. In the multiple receiver case, we formulate geometric 
optimization problems to exploit the trade-off together with 
the transmission time allocation at the senders. With an adaptive 
coding scheme, the sender can allocate less time to receivers 
with low delay sensitivity while compensating for the rate loss 
with larger coding buckets. That allows the sender to allocate 
more time to highly delay-sensitive receivers, which, at the same 
time, are assigned smaller coding bucket sizes. We show 
that the adaptive scheme is more robust and resilient toward 
high and varying delay sensitivities, since the feedback 
information about receiver delay constraints adds extra flexibility 
to the coding and scheduling design. In particular, in many 
systems, this comes with little cost because of the availability 
of feedback channels. Finally, when there are multiple senders, 
we formulate the same optimization problem as a non-convex 
signomial program and approximate the solution with 
successive GP approximations based on single condensation 
methods, and we demonstrate the convergence of the algorithm. 